41 research outputs found

    Efficient Instruction and Data Caching for High Performance Embedded Processors

    In recent years, embedded systems have evolved to offer capabilities previously found only in high-performance systems. Portable devices already integrate on-chip multiprocessors (such as the PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, together with a powerful on-chip multi-level cache hierarchy. As most of these systems are battery-powered, power consumption becomes a critical issue. Achieving both high performance and low power consumption is a highly complex challenge for which some proposals have already been made. Suarez et al. proposed a new on-chip cache hierarchy, the LP-NUCA (Low Power NUCA), which reduces access latency by taking advantage of the properties of NUCA (Non-Uniform Cache Architectures). Its key points are decoupling the functionality and using three specialized on-chip networks. This structure has proven efficient for data hierarchies, achieving good performance while reducing energy consumption. Instruction caches, on the other hand, have different requirements and characteristics from data caches, and these conflict with the requirements of low-power embedded systems, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of using small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. We therefore need to completely re-evaluate our previous design in terms of structure design, interconnection networks (including topologies, flow control and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP (chip multiprocessor) environments with parallel workloads, coherence plays an important role and must be taken into consideration.

    PeRISCVcope: a tiny teaching-oriented RISC-V interpreter

    The rapid advances in computer systems translate into a growing demand for methodologies and tools that bring those novelties into the classroom. Among these advances, virtualization has become an essential technology in almost every relevant system stack, from connected cars to hyperscale cloud servers. However, introducing these technologies into the classroom remains challenging because the sheer complexity of their software components may hinder students' learning. peRISCVcope aims to help in this area by proposing a tiny yet powerful interpreter for digging into virtualization technologies, such as the implementation of trap&emulate hypervisors. With fewer than 2,000 lines of code, and thanks to the conciseness of the RV32I base instruction set of RISC-V, peRISCVcope enables students to make virtualization knowledge their own. This paper presents our experiences developing and testing a virtualization laboratory in which students implement parts of an interpreter. After this practical experience, peRISCVcope has proven to be a useful pedagogical tool and, most importantly, students have rated the experience positively.
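
    As a rough illustration of why the RV32I base set keeps an interpreter small, the following Python sketch decodes and executes two instructions, ADDI and ADD, using the standard RV32I field layout. It is not taken from peRISCVcope: the names `step` and `sign_extend`, and the choice of a plain list as the register file, are assumptions made for this example only.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide in RV32I


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def step(regs, instr):
    """Execute one 32-bit instruction word on a list of 32 registers.

    Hypothetical helper: only ADDI and ADD are handled here.
    """
    # Fixed field positions shared by the R-type and I-type formats.
    opcode = instr & 0x7F
    rd = (instr >> 7) & 0x1F
    funct3 = (instr >> 12) & 0x7
    rs1 = (instr >> 15) & 0x1F
    rs2 = (instr >> 20) & 0x1F
    funct7 = (instr >> 25) & 0x7F

    if opcode == 0x13 and funct3 == 0:  # ADDI rd, rs1, imm
        imm = sign_extend(instr >> 20, 12)
        result = regs[rs1] + imm
    elif opcode == 0x33 and funct3 == 0 and funct7 == 0:  # ADD rd, rs1, rs2
        result = regs[rs1] + regs[rs2]
    else:
        raise NotImplementedError(f"unsupported instruction {instr:#010x}")

    if rd != 0:  # x0 is hard-wired to zero, so writes to it are discarded
        regs[rd] = result & MASK32
    return regs


regs = [0] * 32
step(regs, 0x00500093)  # addi x1, x0, 5
step(regs, 0x00700113)  # addi x2, x0, 7
step(regs, 0x002081B3)  # add  x3, x1, x2
```

    After these three steps, `regs[3]` holds the sum of the two immediates. A fuller interpreter would extend the same decode-and-dispatch pattern to the remaining RV32I instructions.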

    Cooperative CPU, GPU, and FPGA heterogeneous execution with EngineCL

    Heterogeneous systems are the core architecture of most high-performance computing nodes, due to their excellent performance and energy efficiency. However, a key remaining challenge is programmability, specifically, releasing the programmer from the burden of managing data and devices with different architectures. To this end, we extend EngineCL to support FPGA devices. Based on OpenCL, EngineCL is a high-level framework that provides load balancing among devices. Our proposal fully integrates FPGAs into the framework, enabling effective cooperation between CPU, GPU, and FPGA. With command overlapping and judicious data management, our work improves performance by up to 96% compared with single-device execution and delivers energy-delay gains of up to 37%. In addition, adopting FPGAs does not require programmers to make major changes to their applications, because the extensions do not modify the user-facing interface of EngineCL.
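
    To make the idea of load balancing among devices concrete, here is a minimal Python sketch of a static split that gives each device a share of the work proportional to an assumed relative speed. It does not use or reproduce the EngineCL API: the function `split_work`, the device names, and the speed figures are hypothetical and chosen only for illustration.

```python
def split_work(total_items, relative_speeds):
    """Assign each device a contiguous (offset, count) range of work items.

    Shares are proportional to the given relative speeds; the last device
    takes whatever remains so that every item is assigned exactly once.
    """
    total_speed = sum(relative_speeds.values())
    devices = list(relative_speeds)
    shares = {}
    offset = 0
    for index, device in enumerate(devices):
        if index == len(devices) - 1:
            count = total_items - offset  # remainder goes to the last device
        else:
            count = int(total_items * relative_speeds[device] / total_speed)
        shares[device] = (offset, count)
        offset += count
    return shares


# Assumed speeds: the GPU is taken to be three times faster than the others.
shares = split_work(1000, {"cpu": 1, "gpu": 3, "fpga": 1})
```

    A static split like this is only as good as the speed estimates it is given; dynamic schemes instead hand out smaller chunks at run time as devices become idle.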

    Lightweight asynchronous scheduling in heterogeneous reconfigurable systems

    The trend in heterogeneous embedded systems is to integrate accelerators and general-purpose CPU cores on the same die. In these integrated architectures, like the Zynq UltraScale+ board (CPU+FPGA) that we target in this work, hardware support for shared memory and low-overhead synchronization between the accelerator and the CPU cores makes the case for exploring strategies that exploit tight collaboration between the CPUs and the accelerator. In this paper we propose a novel lightweight scheduling strategy, FastFit, targeted at FPGA accelerators, and a new scheduler based on it, named MultiFastFit, which asynchronously tackles heterogeneous systems comprising a variety of CPU cores and FPGA IPs. Our strategy significantly reduces the overhead of automatically computing near-optimal chunk sizes compared with a previous state-of-the-art auto-tuned approach, which makes it more suitable for fine-grained applications. Additionally, our scheduler MultiFastFit has been designed to enable the efficient co-execution of work among compute devices in such a way that all the devices are kept busy while load imbalance is minimized. Our approaches have been evaluated using four benchmarks carefully tuned for the low-power UltraScale+ platform. Our experiments demonstrate that the FastFit strategy always finds the near-optimal FPGA chunk size for any device configuration at a reasonable cost, even for fine-grained and irregular applications, and that heterogeneous CPU+FPGA co-executions that exploit all the compute devices are usually faster and more energy efficient than CPU-only and FPGA-only executions. We have also compared MultiFastFit with other state-of-the-art scheduling strategies, finding that it outperforms another auto-tuned approach by up to 2x and achieves results similar to those of manually tuned schedulers without requiring an offline search for the ideal CPU-FPGA partition or FPGA chunk granularity. © 2022 The Author

    Pasado y futuro de la infección por VIH. Un documento basado en la opinión de expertos

    HIV infection is now almost 40 years old. In this time, along with the catastrophe and tragedy that it has entailed, it has also represented the capacity of modern society to take on a challenge of this magnitude and to transform, thanks to antiretroviral treatment, an almost uniformly lethal disease into a chronic illness compatible with a practically normal personal and relationship life. This anniversary seemed an ideal moment to pause and reflect on the future of HIV infection, the challenges that remain to be addressed, and the prospects for the immediate future. This reflection has to go beyond merely technical approaches by specialized professionals and also address social and ethical aspects. For this reason, the Health Sciences Foundation convened a group of experts in different aspects of this disease to discuss a series of questions that seemed pertinent to all those present. Each question was presented by one of the participants and discussed by the group. The document we offer is the result of this reflection. For transparency purposes, we would like to inform you that GSK has contributed to the funding of this publication. Peer reviewed.

    ANDES, the high resolution spectrograph for the ELT: science case, baseline design and path to construction


    Search for Spatial Correlations of Neutrinos with Ultra-high-energy Cosmic Rays

    For several decades, the origin of ultra-high-energy cosmic rays (UHECRs) has been an unsolved question of high-energy astrophysics. One approach to solving this puzzle is to correlate UHECRs with high-energy neutrinos, since neutrinos are a direct probe of hadronic interactions of cosmic rays and are not deflected by magnetic fields. In this paper, we present three different approaches for correlating the arrival directions of neutrinos with the arrival directions of UHECRs. The neutrino data are provided by the IceCube Neutrino Observatory and ANTARES, while the UHECR data with energies above ∼50 EeV are provided by the Pierre Auger Observatory and the Telescope Array. All experiments provide increased statistics and improved reconstructions with respect to our previous results reported in 2015. The first analysis uses a high-statistics neutrino sample optimized for point-source searches to search for excesses of neutrino clustering in the vicinity of UHECR directions. The second analysis searches for an excess of UHECRs in the direction of the highest-energy neutrinos. The third analysis searches for an excess of pairs of UHECRs and highest-energy neutrinos on different angular scales. None of the analyses found a significant excess, and previously reported overfluctuations are reduced in significance. Based on these results, we further constrain the neutrino flux spatially correlated with UHECRs.
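
    The basic geometric ingredient of any arrival-direction correlation is the angle between two directions on the sky. The Python sketch below computes that angle with the haversine formula and counts neutrino-UHECR pairs closer than a chosen angular scale. It is a simplified illustration, not the collaborations' analysis code: the function names, the (right ascension, declination) tuple format, and the example coordinates are assumptions, and a real analysis would also have to compare the count with the expectation from uncorrelated directions.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle angle in degrees between two sky directions.

    All inputs are right ascension and declination in degrees.
    """
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine form, which stays accurate for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the valid range.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def count_pairs(neutrinos, cosmic_rays, max_sep_deg):
    """Count (neutrino, cosmic ray) pairs separated by at most max_sep_deg."""
    return sum(
        1
        for n_ra, n_dec in neutrinos
        for c_ra, c_dec in cosmic_rays
        if angular_separation_deg(n_ra, n_dec, c_ra, c_dec) <= max_sep_deg
    )


# Made-up directions, used only to exercise the functions.
neutrinos = [(0.0, 0.0)]
cosmic_rays = [(1.0, 0.0), (50.0, 0.0)]
pairs_within_5_deg = count_pairs(neutrinos, cosmic_rays, 5.0)
```

    Repeating the count for several values of `max_sep_deg` gives the pair count as a function of angular scale.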